Noun Phrase Chunking in Hebrew: Influence of Lexical and Morphological Features
Abstract
We present a method for Noun Phrase chunking in Hebrew. We show that the traditional definition of base-NPs as non-recursive noun phrases does not apply in Hebrew, and propose an alternative definition of Simple NPs. We review syntactic properties of Hebrew related to noun phrases, which indicate that the task of Hebrew SimpleNP chunking is harder than base-NP chunking in English. As a confirmation, we apply methods known to work well for English to Hebrew data. These methods give low results (F from 76 to 86) in Hebrew. We then discuss our method, which applies SVM induction over lexical and morphological features. Morphological features improve the average precision by ~0.5%, recall by ~1%, and F-measure by ~0.75, resulting in a system with average performance of 93% precision, 93.4% recall and 93.2 F-measure.
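The reported figures can be sanity-checked with the standard balanced F-measure, the harmonic mean of precision and recall. The sketch below is only an illustration: the function name `f_measure` is hypothetical, and it assumes the abstract's F-measure is the usual F1 score. The abstract reports averages, so the F-measure computed from averaged precision and recall need not match an average of per-run F-measures exactly.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Both arguments are percentages on the same 0-100 scale.
    Assumption: the abstract's "F-measure" is the standard F1 score.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract: 93% precision, 93.4% recall.
reported_f = round(f_measure(93.0, 93.4), 1)
print(reported_f)
```

Under that assumption, the result rounds to the 93.2 stated in the abstract.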
Similar resources
A Supervised Approach to Keyword Extraction from Persian Documents Using Lexical Chains
Keywords are the main focal points of interest within a text and are intended to represent the principal concepts outlined in the document. Determining the keywords using traditional methods is a time-consuming process and requires specialized knowledge of the subject. For the purposes of indexing the vast expanse of electronic documents, it is important to automate the keyword extraction task. S...
A Lexical Resource of Hebrew Verb-Noun Multi-Word Expressions
A verb-noun Multi-Word Expression (MWE) is a combination of a verb and a noun with or without other words, in which the combination has a meaning different from the meaning of the words considered separately. In this paper, we present a new lexical resource of Hebrew Verb-Noun MWEs (VN-MWEs). The VN-MWEs of this resource were manually collected and annotated from five different web resources. I...
Definiteness in the Hebrew Noun Phrase
This paper suggests an analysis of Modern Hebrew noun phrases in the framework of HPSG. It focuses on the peculiar properties of the definite article, including the requirement for definiteness agreement among various elements in the noun phrase, definiteness inheritance in construct-state nominals, the fact that the article does not combine with constructs and the similarities between construct...
SVM Model Tampering and Anchored Learning: A Case Study in Hebrew NP Chunking
We study the issue of porting a known NLP method to a language with few existing NLP resources, specifically Hebrew SVM-based chunking. We introduce two SVM-based methods – Model Tampering and Anchored Learning. These allow fine-grained analysis of the learned SVM models, which provides guidance to identify errors in the training corpus, distinguish the role and interaction of lexical featur...
Publication date: 2006